Implementing integrated hierarchical sampling in R

Lauren O’Brien

Manaaki Whenua - Landcare Research

2022-08-31

Background

  • Sampling for S-Map
  • Need to locate landscape-typical sites during initial mapping phase

Integrated Hierarchical Sampling (IHS)

  • builds on ‘purposive sampling’; published as ‘integrative hierarchical stepwise sampling’ (Yang et al. 2013)
  • Zhu et al. (2008) - 10.1007/978-1-4020-8592-5 - cluster over covariates, sample where membership is high
  • This time: repeat the fuzzy c-means clustering over multiple values of c
  • high membership across more values of c = more regionally typical
  • high membership only at larger values of c = more local/niche
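
The idea in the bullets above can be sketched in a few lines of R. This is a simplified illustration, not the published algorithm or the code in this project: the covariate table is simulated, and the range of c (`c_values`) and the membership cut-off (`threshold`) are assumed values chosen only for the example. It counts, for each cell, how many values of c give it a high membership in some cluster.

```r
library(e1071)  # provides cmeans()

set.seed(1)

# Simulated stand-in for a covariate table: one row per grid cell
covariates <- cbind(
  slope     = c(rnorm(100, 5, 1),    rnorm(100, 15, 2),  rnorm(100, 25, 2)),
  curvature = c(rnorm(100, -1, 0.3), rnorm(100, 0, 0.3), rnorm(100, 1, 0.3)),
  twi       = c(rnorm(100, 12, 1),   rnorm(100, 8, 1),   rnorm(100, 5, 1))
)
x <- scale(covariates)  # standardise so no covariate dominates the distance

c_values  <- 2:6   # assumed range of cluster numbers
threshold <- 0.7   # assumed cut-off for "high" membership

# One column per value of c: TRUE where a cell's largest membership >= threshold
high_membership <- sapply(c_values, function(k) {
  fit <- cmeans(x, centers = k, m = 2, iter.max = 200)
  apply(fit$membership, 1, max) >= threshold
})
colnames(high_membership) <- paste0("c", c_values)

# Number of c values at which each cell is a strong member of some cluster
typicality <- rowSums(high_membership)
table(typicality)
```

Cells with a high `typicality` count are candidates for regionally typical sites; cells that are `TRUE` only in the later columns are candidates for local or niche sites.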

Implementations - local

  • Yang et al. (2013) - 10.1080/13658816.2012.658053

    • watershed scale, loessy farmland, northern China, previously unmapped
    • only clustered on slope, curvatures, TWI
    • validated with grid + ad-hoc + one transect

Implementations - regional

  • Yang et al. (2017) - 10.1016/S1002-0160(17)60322-9

    • regional scale, multiple land uses and mixed terrain
    • stratified on geology map, added rainfall and temp to covariates
    • grid validation - middling results, but very low n
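
Stratifying before clustering might look something like the sketch below. It is an assumption-laden illustration rather than the method of Yang et al. (2017): the cell table is simulated, the geology class names are hypothetical, and both the choice of three clusters per stratum and the within-stratum standardisation are assumptions made for the example.

```r
library(e1071)  # provides cmeans()

set.seed(2)

# Simulated cell table; `geology` stands in for a class read from a geology map
cells <- data.frame(
  geology  = rep(c("alluvium", "loess", "greywacke"), each = 80),
  slope    = runif(240, 0, 30),
  twi      = rnorm(240, 8, 2),
  rainfall = rnorm(240, 900, 150),
  temp     = rnorm(240, 11, 1.5)
)
covariate_names <- c("slope", "twi", "rainfall", "temp")

# Cluster one stratum, standardising the covariates within that stratum
cluster_stratum <- function(d, k = 3) {
  fit <- cmeans(scale(as.matrix(d[, covariate_names])), centers = k, m = 2)
  d$cluster        <- fit$cluster
  d$max_membership <- apply(fit$membership, 1, max)
  d
}

# Split by geology class, cluster each piece, then bind the results together
by_stratum <- lapply(split(cells, cells$geology), cluster_stratum)
result     <- do.call(rbind, by_stratum)
```

Cluster labels in `result` are only meaningful within a stratum, so any later summary would need to group by both `geology` and `cluster`.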

Evaluations

  • Yang et al. (2016) - 10.2136/sssaj2015.08.0285

    • 2 desktop exercises, IHS head to head with cLHS and SRS
    • appears to outperform cLHS at watershed scale (~1:24k)
    • more stable across iterations, potentially lower n required

Software

  • SoLIM
  • GUI-based, closed source
  • aged/retired
  • ☹️

R version

  • Quarto documents - full workflow, current R stack (sf/terra, etc)
  • Reverse-engineered and verified against SoLIM software outputs
    • CHECK_007_RvSolim.qmd
    • DSM_COP_20220831_Obrienl_IHS_in_R.qmd
    • PROCESS_003_calculate-covariates.qmd
    • PROCESS_004_cluster_statistics.qmd
    • PROCESS_005_map-clusters.qmd
    • PROCESS_006_choose-samples.qmd
    • RES_2022_IHS.qmd
    • SETUP_001_get-DEM.qmd
    • SETUP_002_clean-DEM.qmd

library(tidyverse) # wranglin'
library(terra)     # raster 
library(sf)        # vector
library(whitebox)  # geospatial analytics
library(e1071)     # clustering functions
library(fclust)    # more clustering functions

library(leaflet)   # pretty web maps
library(scico)     # pretty colours
library(flextable) # pretty tables
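
To show how some of these packages could fit together, here is a minimal sketch of clustering a covariate raster, mapping the result, and picking one candidate site per cluster. It is not the content of the Quarto documents listed above: the raster is simulated, the layer names, the CRS (`EPSG:2193`), and the choice of four clusters are assumptions, and "highest membership in the cluster" is just one possible selection rule.

```r
library(terra)  # raster handling
library(e1071)  # provides cmeans()

set.seed(3)

# Simulated two-layer covariate raster (stand-ins for real terrain derivatives)
r <- rast(nrows = 40, ncols = 40, xmin = 0, xmax = 1000, ymin = 0, ymax = 1000,
          nlyrs = 2, crs = "EPSG:2193")
values(r) <- cbind(runif(ncell(r), 0, 30), rnorm(ncell(r), 8, 2))
names(r)  <- c("slope", "twi")

v  <- values(r)           # matrix: one row per cell, one column per layer
ok <- complete.cases(v)   # skip NA cells (none here, but real DEMs have them)

n_clusters <- 4           # assumed number of clusters
fit <- cmeans(scale(v[ok, ]), centers = n_clusters, m = 2)

# Write the hard cluster label and the maximum membership back onto the grid
cl <- rep(NA_integer_, ncell(r)); cl[ok] <- fit$cluster
mm <- rep(NA_real_, ncell(r));    mm[ok] <- apply(fit$membership, 1, max)
cluster_r <- rast(r, nlyrs = 1); values(cluster_r) <- cl
maxmem_r  <- rast(r, nlyrs = 1); values(maxmem_r)  <- mm

# Candidate site per cluster: the cell with the highest membership
# (assumes every cluster contains at least one cell)
best_cell <- sapply(seq_len(n_clusters), function(k) {
  idx <- which(cl == k)
  idx[which.max(mm[idx])]
})
sites <- vect(xyFromCell(r, best_cell), crs = crs(r))
sites$cluster <- seq_len(n_clusters)
```

`cluster_r` and `maxmem_r` can then be plotted or written to disk, and `sites` converted with `sf::st_as_sf()` if vector work continues in sf.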

Local testing

Where to from here?

  • Assessment
    • clearer tests vs other methods
    • incorporate wider range of covariates
    • utility for validation stage?
  • Refinement
    • geocmeans package - spatially aware clustering
    • other image segmentation methods e.g. motif
  • Remix
    • examine legacy samples for IHS typicality
    • Zhang et al. (2022) - cluster over different covariate mixes at one optimal c each, then IHS - optimising for multiple soil properties
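
One way the legacy-sample idea could be approached is to score existing sites against clusters fitted on the wider area, using the standard fuzzy c-means membership formula with the cluster centres held fixed. The sketch below is illustrative only: the data are simulated, and the helper `fcm_membership()` is a hypothetical function written for this example, not part of e1071 or fclust.

```r
library(e1071)  # provides cmeans()

set.seed(4)

# Simulated covariates for the mapped area and for three legacy sample sites
area   <- cbind(slope = runif(500, 0, 30), twi = rnorm(500, 8, 2))
legacy <- cbind(slope = c(2, 14, 28),      twi = c(12, 8, 4))

# Standardise both with the area's mean and sd so they share one feature space
ctr <- colMeans(area)
sdv <- apply(area, 2, sd)
area_s   <- scale(area,   center = ctr, scale = sdv)
legacy_s <- scale(legacy, center = ctr, scale = sdv)

m   <- 2
fit <- cmeans(area_s, centers = 4, m = m)

# Membership of new points given fixed centres:
# u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)), with d the Euclidean distance
fcm_membership <- function(x, centers, m = 2) {
  d <- apply(centers, 1, function(ctr_k) sqrt(colSums((t(x) - ctr_k)^2)))
  d <- matrix(d, nrow = nrow(x))      # rows = points, columns = clusters
  d <- pmax(d, .Machine$double.eps)   # guard against zero distances
  w <- d^(-2 / (m - 1))
  w / rowSums(w)
}

legacy_u <- fcm_membership(legacy_s, fit$centers, m = m)
round(legacy_u, 2)
```

Repeating this over several values of c and counting how often each legacy site reaches a high maximum membership would give a typicality score comparable to the one used for new sample sites.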

References

Yang, L., Qi, F., Zhu, A.X., Shi, J., An, Y., 2016. Evaluation of integrative hierarchical stepwise sampling for digital soil mapping. Soil Science Society of America Journal 80, 637–651. https://doi.org/10.2136/sssaj2015.08.0285
Yang, L., Zhu, A.X., Qi, F., Qin, C.Z., Li, B., Pei, T., 2013. An integrative hierarchical stepwise sampling strategy for spatial sampling and its application in digital soil mapping. International Journal of Geographical Information Science 27, 1–23. https://doi.org/10.1080/13658816.2012.658053
Yang, L., Zhu, A.X., Zhao, Y., Li, D., Zhang, G., Zhang, S., Band, L.E., 2017. Regional soil mapping using multi-grade representative sampling and a fuzzy membership-based mapping approach. Pedosphere 27, 344–357. https://doi.org/10.1016/S1002-0160(17)60322-9
Zhang, L., Yang, L., Cai, Y., Huang, H., Shi, J., Zhou, C., 2022. A multiple soil properties oriented representative sampling strategy for digital soil mapping. Geoderma 406, 115531. https://doi.org/10.1016/j.geoderma.2021.115531
Zhu, A.X., Yang, L., Li, B., Qin, C., English, E., Burt, J.E., Zhou, C., 2008. Purposive sampling for digital soil mapping for areas with limited data, in: Hartemink, A.E., McBratney, A.B., Lourdes Mendonça-Santos, M. de (Eds.), Digital Soil Mapping with Limited Data. Springer, pp. 233–245. https://doi.org/10.1007/978-1-4020-8592-5